In the standard formulation of the classical denoising problem, one is given a probabilistic model relating a latent variable $$\varTheta \in \varOmega \subset{\mathbb{R}}^{m} \; (m\ge 1)$$ and an observation $$Z \in{\mathbb{R}}^{d}$$ according to $$Z \mid \varTheta \sim p(\cdot \mid \varTheta )$$ and $$\varTheta \sim G^{*}$$, and the goal is to construct a map that recovers the latent variable from the observation. The posterior mean, a natural candidate for estimating $$\varTheta $$ from $$Z$$, attains the minimum Bayes risk (under the squared error loss), but at the expense of over-shrinking $$Z$$, and in general it may fail to capture the geometric features of the prior distribution $$G^{*}$$ (e.g. low dimensionality, discreteness, sparsity). To rectify these drawbacks, in this paper we take a new perspective on the denoising problem that is inspired by optimal transport (OT) theory, and we use it to study a different, OT-based denoiser at the population level. We rigorously prove that, under general assumptions on the model, this OT-based denoiser is mathematically well defined and unique, and is closely connected to the solution of a Monge OT problem. We then prove that, under appropriate identifiability assumptions on the model, the OT-based denoiser can be recovered solely from the marginal distribution of $$Z$$ and the posterior mean of the model, after solving a linear relaxation problem over a suitable space of couplings that is reminiscent of standard multimarginal OT problems. In particular, by Tweedie's formula, when the likelihood model $$\{ p(\cdot \mid \theta ) \}_{\theta \in \varOmega }$$ is an exponential family of distributions, the OT-based denoiser can be recovered solely from the marginal distribution of $$Z$$.
In general, our family of OT-like relaxations is of interest in its own right, and for the denoising problem it suggests alternative numerical methods inspired by the rich literature on computational OT.
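For a Gaussian likelihood, Tweedie's formula expresses the posterior mean through the marginal of $$Z$$ alone: $$\mathbb{E}[\varTheta \mid Z=z] = z + \sigma^{2}\,\nabla \log p(z)$$. The sketch below illustrates this and the over-shrinking phenomenon the abstract describes, assuming a Gaussian likelihood and a hypothetical two-atom prior; the names `tweedie_denoise` and `grad_log_marginal` are illustrative and not from the paper.

```python
import numpy as np

def tweedie_denoise(z, grad_log_marginal, sigma2):
    """Posterior-mean denoiser via Tweedie's formula for the Gaussian
    likelihood Z | Theta ~ N(Theta, sigma2):
        E[Theta | Z = z] = z + sigma2 * d/dz log p(z)."""
    return z + sigma2 * grad_log_marginal(z)

# Hypothetical discrete prior: Theta uniform on the two atoms {-1, +1},
# so the marginal p(z) is a two-component Gaussian mixture.
sigma2 = 0.25
atoms = np.array([-1.0, 1.0])

def grad_log_marginal(z):
    # d/dz log p(z) for the two-atom mixture; reduces to
    # (E[Theta | z] - z) / sigma2, with posterior weights w.
    w = np.exp(-(z - atoms) ** 2 / (2 * sigma2))
    w /= w.sum()
    return (w @ atoms - z) / sigma2

theta_hat = tweedie_denoise(0.8, grad_log_marginal, sigma2)
```

Note that at $$z = 0$$ this denoiser returns $$0$$, midway between the two atoms rather than on either of them: the posterior mean averages over the prior and so fails to preserve its discreteness, which is the drawback the OT-based denoiser is designed to avoid.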
-
In this paper, we focus on the computation of the nonparametric maximum likelihood estimator (NPMLE) in multivariate mixture models. Our approach discretizes this infinite-dimensional convex optimization problem by fixing the support points of the NPMLE and optimizing over the mixing proportions. We propose an efficient and scalable semismooth Newton based augmented Lagrangian method (ALM). Our algorithm outperforms the state-of-the-art methods (Kim et al., 2020; Koenker and Gu, 2017) and is capable of handling $$n \approx 10^{6}$$ data points with $$m \approx 10^{4}$$ support points. A key advantage of our approach is its strategic utilization of the solution's sparsity, which leads to structured sparsity in the Hessian computations. As a result, our algorithm scales better in $$m$$ than the mixsqp method (Kim et al., 2020). The computed NPMLE can be directly applied to denoising the observations in the empirical Bayes framework. We propose new denoising estimands in this context, along with their consistent estimates. Extensive numerical experiments are conducted to illustrate the efficiency of our ALM. In particular, we employ our method to analyze two astronomy data sets: (i) the Gaia-TGAS Catalog (Anderson et al., 2018), containing approximately $$1.4 \times 10^{6}$$ data points in two dimensions, and (ii) a data set from the APOGEE survey (Majewski et al., 2017) with approximately $$2.7 \times 10^{4}$$ data points.
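Fixing the support points reduces the NPMLE to a finite-dimensional convex problem in the mixing proportions $$\pi$$: maximize $$\frac{1}{n}\sum_{i} \log \sum_{j} \pi_{j}\, p(z_{i} \mid \theta_{j})$$ over the simplex. The sketch below solves this reduced problem with a plain EM fixed-point iteration on the weights — a much simpler (and far slower) substitute for the semismooth Newton ALM proposed in the paper — under an assumed one-dimensional Gaussian likelihood on a hypothetical grid of support points; all names are illustrative.

```python
import numpy as np

def npmle_weights_em(L, n_iter=500):
    """Optimize mixing proportions pi for fixed support points by EM.
    L[i, j] = p(z_i | theta_j) is the n x m likelihood matrix.
    This is a baseline fixed-point sketch, not the paper's ALM."""
    n, m = L.shape
    pi = np.full(m, 1.0 / m)
    for _ in range(n_iter):
        post = L * pi                            # unnormalized responsibilities
        post /= post.sum(axis=1, keepdims=True)  # normalize each row
        pi = post.mean(axis=0)                   # EM update for the weights
    return pi

# Hypothetical 1-D example: data from a two-bump mixture, support on a grid.
rng = np.random.default_rng(0)
z = np.concatenate([rng.normal(-2, 1, 500), rng.normal(2, 1, 500)])
grid = np.linspace(-4, 4, 81)
L = np.exp(-0.5 * (z[:, None] - grid[None, :]) ** 2)  # Gaussian kernel, constants dropped
pi = npmle_weights_em(L)
```

Each EM step monotonically increases the mixture log-likelihood, and the solution is typically sparse (most grid weights vanish) — the structural property the paper's ALM exploits for scalability.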